19 research outputs found

Self-supervised Dimensionality Reduction with Neural Networks and Pseudo-labeling

Author: Espadoto Mateus
Hirata Nina S.T.
Telea Alexandru C.
Publication venue: 'Scitepress'
Publication date: 01/01/2021

Dimensionality reduction (DR) is used to explore high-dimensional data in many applications. Deep learning techniques such as autoencoders have been used to provide fast, simple-to-use, and high-quality DR. However, such methods yield worse visual cluster separation than popular methods such as t-SNE and UMAP. We propose a deep learning DR method called Self-Supervised Network Projection (SSNP) which does DR based on pseudo-labels obtained from clustering. We show that SSNP produces better cluster separation than autoencoders, has out-of-sample, inverse mapping, and clustering capabilities, and is very fast and easy to use.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen
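
As a rough illustration of the pseudo-labeling idea in the SSNP abstract above, the sketch below assigns cluster indices to unlabeled points so that they could later serve as training targets. The abstract only says "clustering"; the use of plain k-means, the function name `kmeans_pseudo_labels`, the evenly spaced initialization, and the toy data are all assumptions made for this example. The network training step that SSNP performs on top of such labels is not shown.

```python
import math

def kmeans_pseudo_labels(points, k, iters=20):
    """Return one cluster index (pseudo-label) per point using plain k-means."""
    n = len(points)
    # Deterministic initialization (illustrative choice): evenly spaced samples.
    centroids = [list(points[i * n // k]) for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: each point takes the index of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy data (hypothetical): two well-separated groups in 3D.
points = [
    (0.0, 0.1, 0.0), (0.2, 0.0, 0.1), (0.1, 0.2, 0.0),
    (5.0, 5.1, 5.0), (5.2, 5.0, 5.1), (5.1, 5.2, 5.0),
]
labels = kmeans_pseudo_labels(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In a pipeline of this kind, the resulting labels would stand in for ground-truth classes when training a network; any clustering method could be substituted for k-means here.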

Deep Learning Multidimensional Projections

Author: Espadoto Mateus
Hirata Nina S. T.
Telea Alexandru C.
Publication date: 21/02/2019

Dimensionality reduction methods, also known as projections, are frequently used for exploring multidimensional data in machine learning, data science, and information visualization. Among these, t-SNE and its variants have become very popular for their ability to visually separate distinct data clusters. However, such methods are computationally expensive for large datasets, suffer from stability problems, and cannot directly handle out-of-sample data. We propose a learning approach to construct such projections. We train a deep neural network based on a collection of samples from a given data universe, and their corresponding projections, and next use the network to infer projections of data from the same, or similar, universes. Our approach generates projections with characteristics similar to those of the learned ones, is computationally two to three orders of magnitude faster than SNE-class methods, has no complex-to-set user parameters, handles out-of-sample data in a stable manner, and can be used to learn any projection technique. We demonstrate our proposal on several real-world high-dimensional datasets from machine learning.

arXiv.org e-Print Archive

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Utrecht University Repository

Dissertations of the University of Groningen
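
The learning approach described in the "Deep Learning Multidimensional Projections" abstract above (train a network on samples and their precomputed projections, then infer projections for unseen data) can be sketched as a small regression problem. This is a minimal sketch under stated assumptions, not the authors' implementation: the one-hidden-layer tanh network, the hyperparameters, the function name `train_projection_net`, and the linear map used as a stand-in for a real projection technique such as t-SNE are all hypothetical.

```python
import math
import random

def train_projection_net(X, P, hidden=8, lr=0.05, epochs=200, seed=0):
    """Fit a one-hidden-layer tanh network that maps samples X to 2D points P."""
    rng = random.Random(seed)
    d, out = len(X[0]), len(P[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(out)]
    b2 = [0.0] * out

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(W1[j], x)) + b1[j])
             for j in range(hidden)]
        y = [sum(w * hj for w, hj in zip(W2[k], h)) + b2[k] for k in range(out)]
        return h, y

    def mse():
        # Squared error summed over output coordinates, averaged over samples.
        return sum((yk - tk) ** 2
                   for x, t in zip(X, P)
                   for yk, tk in zip(forward(x)[1], t)) / len(X)

    loss_before = mse()
    for _ in range(epochs):
        for x, t in zip(X, P):  # plain stochastic gradient descent
            h, y = forward(x)
            dy = [yk - tk for yk, tk in zip(y, t)]
            # Gradient at the hidden pre-activations, computed before W2 changes.
            dz = [sum(dy[k] * W2[k][j] for k in range(out)) * (1 - h[j] ** 2)
                  for j in range(hidden)]
            for k in range(out):
                for j in range(hidden):
                    W2[k][j] -= lr * dy[k] * h[j]
                b2[k] -= lr * dy[k]
            for j in range(hidden):
                for i in range(d):
                    W1[j][i] -= lr * dz[j] * x[i]
                b1[j] -= lr * dz[j]
    return (lambda x: forward(x)[1]), loss_before, mse()

# Toy training set (hypothetical): 40 samples in 5 dimensions.
data_rng = random.Random(1)
X = [[data_rng.uniform(-1, 1) for _ in range(5)] for _ in range(40)]
# Stand-in for projections precomputed by some existing technique.
P = [[x[0] + 0.5 * x[1], x[2] - 0.5 * x[3]] for x in X]

project, loss_before, loss_after = train_projection_net(X, P)
new_point = [0.1, -0.2, 0.3, 0.0, 0.5]  # out-of-sample data
print(loss_before, loss_after, project(new_point))
```

Once trained, `project` is a plain function call, which is what makes inference on new points cheap compared with rerunning an iterative projection method.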

SDBM: Supervised Decision Boundary Maps for Machine Learning Classifiers

Author: Espadoto Mateus
Hirata Roberto
Oliveira Artur
Telea Alex
Publication date: 01/01/2022

Understanding the decision boundaries of a machine learning classifier is key to gaining insight into how classifiers work. Recently, a technique called Decision Boundary Map (DBM) was developed to enable the visualization of such boundaries by leveraging direct and inverse projections. However, DBM has scalability issues when creating fine-grained maps, and can generate results that are hard to interpret when the classification problem has many classes. In this paper we propose a new technique called Supervised Decision Boundary Maps (SDBM), which uses a supervised, GPU-accelerated projection technique that solves the original DBM shortcomings. We show through several experiments that SDBM generates results that are much easier to interpret when compared to DBM, and is faster and easier to use, while still being generic enough to be used with any type of single-output classifier

Utrecht University Repository
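
The decision-map idea mentioned in the SDBM abstract above (use an inverse projection to carry each 2D pixel back to data space, then ask the classifier for its label) can be sketched in a few lines. This is an illustrative sketch only: `decision_map`, `toy_inverse_project`, and `toy_classify` are hypothetical names, and the toy inverse projection and threshold classifier are placeholders for a learned inverse projection and a trained model.

```python
def decision_map(inverse_project, classify, resolution=8):
    """Label every pixel of a resolution x resolution grid over [0, 1]^2."""
    grid = []
    for row in range(resolution):
        row_labels = []
        for col in range(resolution):
            # Pixel center in the 2D projection space.
            u = (col + 0.5) / resolution
            v = (row + 0.5) / resolution
            # Map the pixel back to data space, then classify that point.
            row_labels.append(classify(inverse_project(u, v)))
        grid.append(row_labels)
    return grid

def toy_inverse_project(u, v):
    """Hypothetical inverse projection from 2D to a 3D data space."""
    return (u, v, u + v)

def toy_classify(x):
    """Hypothetical two-class classifier with boundary x[0] == x[1]."""
    return 1 if x[0] > x[1] else 0

grid = decision_map(toy_inverse_project, toy_classify, resolution=4)
for row_labels in grid:
    print("".join(str(label) for label in row_labels))
# 0111
# 0011
# 0001
# 0000
```

Coloring each cell by its label gives a dense image of the classifier's decision zones; the cost grows with the number of pixels, which is consistent with the scalability concern the abstract raises for fine-grained maps.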

Constructing and Visualizing High-Quality Classifier Decision Boundary Maps

Author: Espadoto Mateus
Hirata Jr Roberto
Rodrigues Francisco C. M.
Telea Alexandru C.
Publication venue: 'MDPI AG'
Publication date: 01/09/2019

Visualizing decision boundaries of machine learning classifiers can help in classifier design, testing and fine-tuning. Decision maps are visualization techniques that overcome the key sparsity-related limitation of scatterplots for this task. To increase the trustworthiness of decision map use, we perform an extensive evaluation considering the dimensionality-reduction (DR) projection techniques underlying decision map construction. We extend the visual accuracy of decision maps by proposing additional techniques to suppress errors caused by projection distortions. Additionally, we propose ways to estimate and visually encode the distance-to-decision-boundary in decision maps, thereby enriching the conveyed information. We demonstrate our improvements and the insights that decision maps convey on several real-world datasets.

Multidisciplinary Digital Publishing Institute

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Using multiple attribute-based explanations of multidimensional projections to explore high-dimensional data

Author: Espadoto Mateus
Telea Alexandru
Tian Zonglin
van Driel Daan
van Steenpaal Gijs
Zhai Xiaorui
Publication venue: 'Elsevier BV'
Publication date: 01/08/2021

Multidimensional projections (MPs) are effective methods for visualizing high-dimensional datasets to find structures in the data like groups of similar points and outliers. The insights obtained from MPs can be amplified by complementing these techniques with several so-called explanatory mechanisms. We present and discuss a set of six such mechanisms that explain MPs in terms of similar dimensions, local dimensionality, and dimension correlations. We implement our explanatory tools using an image-based approach, which is efficient to compute, scales well visually for large and dense MP scatterplots, and can handle any projection technique. We demonstrate how the provided explanatory views can be combined to augment each other's value and thereby lead to refined insights into the data for several high-dimensional datasets, and how these insights correlate with known facts about the data under study.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

HyperNP: Interactive Visual Exploration of Multidimensional Projection Hyperparameters

Author: Anderson Erik W
Appleby Gabriel
Chang Remco
Chen Rui
Espadoto Mateus
Goree Samuel
Telea Alexandru
Publication date: 25/06/2021

Projection algorithms such as t-SNE or UMAP are useful for the visualization of high-dimensional data, but depend on hyperparameters which must be tuned carefully. Unfortunately, iteratively recomputing projections to find the optimal hyperparameter value is computationally intensive and unintuitive due to the stochastic nature of these methods. In this paper we propose HyperNP, a scalable method that allows for real-time interactive hyperparameter exploration of projection methods by training neural network approximations. HyperNP can be trained on a fraction of the total data instances and hyperparameter configurations and can compute projections for new data and hyperparameters at interactive speeds. HyperNP is compact in size and fast to compute, thus allowing it to be embedded in lightweight visualization systems such as web browsers. We evaluate HyperNP across three datasets in terms of performance and speed. The results suggest that HyperNP is accurate, scalable, interactive, and appropriate for use in real-world settings.

arXiv.org e-Print Archive

Utrecht University Repository

Aprendendo projeções multidimensionais com redes neurais

Author: Espadoto Mateus
Publication venue: 'Universidade de Sao Paulo, Agencia USP de Gestao da Informacao Academica (AGUIA)'
Publication date: 02/09/2022

Learning multidimensional projections with neural networks / Aprendendo projeções multidimensionais com redes neurais

Biblioteca Digital de Teses e Dissertações

Learning Multidimensional Projections with Neural Networks

Author: Espadoto Mateus
Publication venue: 'University of Groningen Press'
Publication date: 01/01/2021

In the wake of the revolution brought by Deep Learning, we believe neural networks can be leveraged as a tool in the service of dimensionality reduction (DR) for understanding large datasets with many dimensions (measurements). In this work, we present techniques for DR based on neural networks which improve over existing techniques on criteria such as scalability, dealing with unseen data, cluster separation, and ease of use, to name a few. We also present a quantitative evaluation of popular techniques, and propose novel applications that highlight the importance of DR techniques as tools for high-dimensional data analysis.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Stability Analysis of Supervised Decision Boundary Maps

Author: Espadoto Mateus
Hirata Roberto
Oliveira Artur
Telea Alex
Publication date: 01/05/2023

Understanding how a machine learning classifier works is an important task in machine learning engineering. However, doing this for an arbitrary classifier is generally difficult. We propose to leverage visualization methods for this task. For this, we extend a recent technique called Decision Boundary Map (DBM) which graphically depicts how a classifier partitions its input data space into decision zones separated by decision boundaries. We use a supervised, GPU-accelerated technique that computes bidirectional mappings between the data and projection spaces to solve several shortcomings of DBM, such as accuracy and speed. We present several experiments that show that SDBM generates results that are easier to interpret and far less prone to noise, and computes significantly faster than DBM, while maintaining the genericity and ease of use of DBM for any type of single-output classifier. We also show, in addition to earlier work, that SDBM is stable with respect to various types and amounts of changes of the training set used to construct the visualized classifiers. This property was, to our knowledge, not investigated for any comparable method for visualizing classifier decision maps, and is essential for the deployment of such visualization methods in analyzing real-world classification models.

Utrecht University Repository